Experiments with Reinforcement Learning in Problems with Continuous State and Action Spaces

Authors

  • Juan Carlos Santamaría
  • Richard S. Sutton
  • Ashwin Ram
Abstract

A key element in the solution of reinforcement learning problems is the value function. The purpose of this function is to measure the long-term utility or value of any given state, and it is important because an agent can use this measure to decide what to do next. A common problem in reinforcement learning applied to systems with continuous state and action spaces is that the value function must operate over real-valued variables, which means it must be able to represent the value of infinitely many state-action pairs. For this reason, function approximators are used to represent the value function when a closed-form solution of the optimal policy is not available. In this paper we extend a previously proposed reinforcement learning algorithm so that it can be used with function approximators that generalize the value of individual experiences across both state and action spaces. In particular, we discuss the benefits of using sparse coarse-coded function approximators to represent value functions and describe three implementations in detail: CMAC, instance-based, and case-based. Additionally, we discuss how function approximators with different degrees of resolution in different regions of the state and action spaces may influence the performance and learning efficiency of the agent. We propose a simple and modular technique that can be used to implement function approximators with non-uniform degrees of resolution, so that the value function can be represented with higher accuracy in important regions of the state and action spaces. We performed extensive experiments on the double-integrator and pendulum swing-up systems to demonstrate the proposed ideas.

Keywords: reinforcement learning, function approximation, memory-based methods, continuous domains, optimal control, resource preallocation
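The sparse coarse coding the abstract describes can be sketched as a minimal one-dimensional CMAC: several overlapping tilings discretize the continuous input, the predicted value is the sum of one weight per tiling, and each update generalizes to nearby inputs that share tiles. The class name, tiling parameters, and sparse-dictionary storage below are illustrative assumptions, not the paper's implementation.

```python
class CMAC:
    """Sparse coarse-coded function approximator (a minimal sketch)."""

    def __init__(self, n_tilings=8, tile_width=0.5, alpha=0.1):
        self.n_tilings = n_tilings
        self.tile_width = tile_width
        self.alpha = alpha      # learning rate, divided across tilings
        self.weights = {}       # sparse weight table: (tiling, index) -> weight

    def _active_tiles(self, x):
        # Each tiling is shifted by a fraction of the tile width, so
        # nearby inputs activate overlapping but not identical tile sets.
        tiles = []
        for t in range(self.n_tilings):
            offset = t * self.tile_width / self.n_tilings
            tiles.append((t, int((x + offset) // self.tile_width)))
        return tiles

    def predict(self, x):
        # Value is the sum of the weights of the active tiles (one per tiling).
        return sum(self.weights.get(tile, 0.0) for tile in self._active_tiles(x))

    def update(self, x, target):
        # Distribute the prediction error evenly across the active tiles.
        error = target - self.predict(x)
        step = self.alpha * error / self.n_tilings
        for tile in self._active_tiles(x):
            self.weights[tile] = self.weights.get(tile, 0.0) + step
```

Because updates touch only the handful of tiles covering the input, training at one point pulls predictions at nearby points toward the same target while leaving distant regions of the space untouched.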

Similar resources

Reinforcement Learning in Continuous State and Action Spaces

Many traditional reinforcement-learning algorithms have been designed for problems with small finite state and action spaces. Learning in such discrete problems can be difficult, due to noise and delayed reinforcements. However, many real-world problems have continuous state or action spaces, which can make learning a good decision policy even more involved. In this chapter we discuss how to ...


Operation Scheduling of MGs Based on Deep Reinforcement Learning Algorithm

In this paper, the operation scheduling of Microgrids (MGs), including Distributed Energy Resources (DERs) and Energy Storage Systems (ESSs), is proposed using a Deep Reinforcement Learning (DRL) based approach. Due to the dynamic characteristics of the problem, it is first formulated as a Markov Decision Process (MDP). Next, the Deep Deterministic Policy Gradient (DDPG) algorithm is presented t...


Hierarchical Policy Gradient Algorithms

Hierarchical reinforcement learning is a general framework which attempts to accelerate policy learning in large domains. On the other hand, policy gradient reinforcement learning (PGRL) methods have received recent attention as a means to solve problems with continuous state spaces. However, they suffer from slow convergence. In this paper, we combine these two approaches and propose a family ...


Improved State Aggregation with Growing Neural Gas in Multidimensional State Spaces

Q-Learning is a widely used method for dealing with reinforcement learning problems. However, the conditions for its convergence include an exact representation and sufficiently (in theory even infinitely) many visits of each state-action pair—requirements that raise problems for large or continuous state spaces. To speed up learning and to exploit gained experience more efficiently it is highl...
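The tabular Q-learning update whose convergence conditions this passage cites can be written in a few lines; the function and variable names below are illustrative, and the sparse dictionary stands in for the exact table the convergence result assumes.

```python
def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step (a generic sketch, not this paper's method).

    Convergence requires one exact entry per (state, action) pair and
    repeated visits to every pair -- exactly the requirements that break
    down for large or continuous state spaces.
    """
    # Bootstrap from the greedy value of the successor state.
    best_next = max(Q.get((s_next, b), 0.0) for b in actions)
    td_error = r + gamma * best_next - Q.get((s, a), 0.0)
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * td_error
    return Q[(s, a)]
```

State aggregation methods such as growing neural gas replace the raw state `s` in this table with the index of its nearest prototype, trading exactness for a tractable number of entries.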


Continuous-action reinforcement learning with fast policy search and adaptive basis function selection

As an important approach to solving complex sequential decision problems, reinforcement learning (RL) has been widely studied in the community of artificial intelligence and machine learning. However, the generalization ability of RL is still an open problem and it is difficult for existing RL algorithms to solve Markov decision problems (MDPs) with both continuous state and action spaces. In t...



Journal:
  • Adaptive Behaviour

Volume 6, Issue 

Pages  -

Published 1997